Identification of Protein Motifs Using Conserved Amino Acid Properties and Partitioning Techniques

Authors

  • Thomas D. Wu
  • Douglas L. Brutlag
Abstract

Analyzing a set of protein sequences involves a fundamental relationship between the coherency of the set and the specificity of the motif that describes it. Motifs may be obscured by training sets that contain incoherent sequences, in part due to protein subclasses, contamination, or errors. We develop an algorithm for motif identification that systematically explores possible patterns of coherency within a set of protein sequences. Our algorithm constructs alternative partitions of the training set data, where one subset of each partition is presumed to contain coherent data and is used for forming a motif. The motif is represented by multiple overlapping amino acid groups based on evolutionary, biochemical, or physical properties. We demonstrate our method on a training set of reverse transcriptases that contains subclasses, sequence errors, misalignments, and contaminating sequences. Despite these complications, our program identifies a novel motif for the subclass of retroviral and retrovirus-related reverse transcriptases. This motif has a much higher specificity than previously reported motifs and suggests the importance of conserved hydrophilic and hydrophobic residues in the structure of reverse transcriptases.
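The two ideas in the abstract, describing each motif position by an amino acid group and trying alternative partitions of the training set, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the group definitions in `GROUPS`, the `looseness` score, the leave-one-out partitioning, and the toy sequences are hypothetical and are not taken from the paper, which may use different groups and a different search over partitions. The sketch also assumes ungapped, equal-length sequences over the 20 standard amino acids.

```python
# Illustrative, overlapping amino acid groups. These are an assumption for
# this sketch, not the groups used in the paper.
ALL_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
GROUPS = {
    "aliphatic": set("VLI"),
    "aromatic": set("FWYH"),
    "hydrophobic": set("AVLIMFWC"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "charged": set("DEKRH"),
    "polar": set("STNQYCH"),
    "small": set("GASCTPDNV"),
    "any": ALL_RESIDUES,
}


def allowed(symbol):
    """Residues permitted by a motif symbol (a group name or a single residue)."""
    return GROUPS.get(symbol, {symbol})


def column_symbol(residues):
    """Pick the smallest group that contains every residue seen in one column."""
    observed = set(residues)
    if len(observed) == 1:
        return next(iter(observed))  # fully conserved position
    fitting = [(len(members), name)
               for name, members in GROUPS.items() if observed <= members]
    return min(fitting)[1]


def build_motif(aligned):
    """Describe each alignment column by its tightest fitting symbol."""
    return [column_symbol(column) for column in zip(*aligned)]


def looseness(motif):
    """Hypothetical score: total residues allowed; lower means more specific."""
    return sum(len(allowed(symbol)) for symbol in motif)


def best_leave_one_out(aligned):
    """Try each partition that sets one sequence aside; keep the tightest motif."""
    results = []
    for i in range(len(aligned)):
        subset = aligned[:i] + aligned[i + 1:]
        motif = build_motif(subset)
        results.append((looseness(motif), i, motif))
    return min(results)  # (score, index of excluded sequence, motif)


# Toy training set: three similar sequences plus one contaminant.
training = ["YVDD", "YMDD", "YLDD", "GKSA"]
print(build_motif(training))
print(best_leave_one_out(training))
```

In this toy case the motif built from all four sequences degrades to very broad groups, while the partition that sets the fourth sequence aside yields a much tighter description. That is the relationship between set coherency and motif specificity the abstract describes, shown here only in miniature.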

Similar resources

Identification of Novel Mutations in IL-2 Gene in Khorasan Native Fowls

The intron-exon structure of Khorasan native fowl interleukin-2 (IL-2) was investigated. For this purpose, twenty chickens were selected from the Native Fowl Breeding Station of Khorasan province, and genomic DNA was extracted using a modified conventional DNA extraction protocol. An 875 bp fragment of IL-2 was successfully amplified, including a small part of the promoter, exon 1, intron 1, an...

In silico investigation of lactoferrin protein characterizations for the prediction of anti-microbial properties

Lactoferrin (Lf) is an iron-binding multi-functional glycoprotein which has numerous physiological functions such as iron transportation, anti-microbial activity and immune response. In this study, different in silico approaches were exploited to investigate Lf protein properties in a number of mammalian species. Results showed that the iron-binding site, DNA and RNA-binding sites, signal pepti...

Designing Of Degenerate Primers-Based Polymerase Chain Reaction (PCR) For Amplification Of WD40 Repeat-Containing Proteins Using Local Allignment Search Method

Degenerate primer-based polymerase chain reaction (PCR) is commonly used for isolation of unidentified gene sequences in related organisms. For designing the degenerate primers, we propose the use of a local alignment search method to find conserved regions long enough to design an acceptable primer pair. To test this method, a WD40 repeat-containing domain protein from Beauveria bass...

In Silico Analysis of Glutaminase from Different Species of Escherichia and Bacillus

Background: Glutaminase (EC 3.5.1.2) catalyzes the hydrolytic degradation of L-glutamine to L-glutamic acid and has been introduced for cancer therapy in recent years. The present study was an in silico analysis of glutaminase to further elucidate its structure and physicochemical properties. Methods: Forty glutaminase protein sequences from different species of Escherichia and Bacillus obtained...

In silico comparison of Iranian HIV-1 envelop glycoprotein with five nearby countries

HIV-1 envelope (env) glycoprotein plays an important role in the entry of the virus into susceptible target cells. As the env glycoprotein of HIV-1 is highly variable across geographical regions, in the present study, different properties of this protein in Iran are compared with five nearby countries. The sequences of HIV-1 env glycoproteins of Iran, Afghanistan, Russia, Turkey, Pakis...

Journal title:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 3  Issue 

Pages  -

Publication date 1995